Exploiting Inter-Sample Information and Exploring Visualization in Data Mining: from Bioinformatics to Anthropology and Aesthetics Disciplines
Authors
Abstract
In this data-overabundant world, revealing and representing comprehensible relationships behind complicated datasets have become important challenges in data mining. This chapter presents recent achievements in applying data mining techniques to two application areas, microarray analysis and an anthropological study of Wi-Fi networks, and applies visualization techniques to help integrate heterogeneous databases and obtain useful interpretations of the data. As the amount of available microarray data has increased exponentially, integration of heterogeneous databases has become necessary. However, direct integration of microarrays remains ineffective after normalization because of the diverse types of experiment-specific variation. This chapter reviews two approaches to overcome this issue and introduces a cube model that combines and outperforms the two approaches by extracting information from yeast genes. The chapter then turns to a recent anthropology study that applies data mining to visualize urban Wi-Fi networks. In the past, artists could not store and handle huge amounts of data without databases, and therefore their works typically failed to communicate with other disciplines. Nonetheless, anthropologists have explored implicit relations and viewed them in multiple ways, and they have created a number of different categorization principles, including binary opposition, functional structure and interpretation. These multiple principles can serve as the basis for visualizing human thinking and reasoning in cross-cultural and interdisciplinary studies. The anthropology study described in this chapter relates Wi-Fi statistics in complicated fieldwork databases to easily understood cultural phenomena. Clear visualizations also involve data aesthetics, which focuses on how to represent data in eye-catching and categorized forms.
This chapter demonstrates that aesthetics can help extend existing data mining ideas to visual representations, using the example of combining the cube model introduced for bioinformatics data mining with the representation of regional Wi-Fi networks in spatial-temporal color charts.
Similar resources
Seriation and matrix reordering methods: An historical overview
Seriation is an exploratory combinatorial data analysis technique to reorder objects into a sequence along a one-dimensional continuum so that it best reveals regularity and patterning among the whole series. Unsupervised learning, using seriation and matrix reordering, allows pattern discovery simultaneously at three information levels: local fragments of relationships, sets of organized local...
Visual matrix explorer for collaborative seriation
In this article, we present a web-based open source tool to support cross-disciplinary collaborative seriation with the following goals: to compare different matrix permutations, to discover patterns in the data, to annotate it, and to accumulate knowledge. Seriation is an unsupervised data mining technique that reorders objects into a sequence along a one-dimensional continuum to make sense of the ...
Accompaniment of Natural and Artificial Urban Elements in the Creation of Urban Aesthetics (Case of Study: Isfahan City)
The objective of this article is to present the established components with regard to the visual quality and legibility of Isfahan city according to Lynch’s theory. Data was gathered in Isfahan mainly by way of observation and interviews. Additional information was obtained from historical data and urban documents. In the opinion of the citizens, the Zayandehrood River, as a natural urban compo...
Identification of the Patient Requirements Using Lean Six Sigma and Data Mining
Lean health care is one of the new management approaches that put the patient at the core of each change. Lean construction is based on visualization for understanding and prioritizing improvements. When only visualization techniques are used, much important information can be missed. In order to prioritize and select improvements, it’s essential to integrate new analysis tools to achieve a good unders...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has today become one of the concerns of data science scholar...